“Wanna come over watch a movie and chill?” Everything was going really well for Jessica and David: They were on their third date watching a movie at her place. Things took a turn for the worse when they tried to pick a movie to watch. They thought they were signing up to watch a comedy, but what they got was a drama. Things didn’t work out too well for them.

IMDb genre categorization often conflicts with how a movie is marketed and what audiences expect. We look at this disparity and its impact on how movies are perceived.

Production companies spend time and effort controlling how a movie is marketed in order to attract certain audiences and increase sales. After a movie’s release, its entry in IMDb is based on an editor’s opinion and attempts to appeal to as broad an audience as possible. We will look at several of the most popular movies released between 1940 and 2000 to see whether IMDb has categorized them the same way they were marketed when they were first released.

To choose which movies to analyze, we queried the database for the movie with the most recorded votes in each of four years (1940, 1960, 1980, and 2000), as of the time the data was collected.

library(DBI)    # dbGetQuery()
library(dplyr)  # %>%, filter(), group_by(), summarize()
library(knitr)  # kable()

# `db` is assumed to be an open DBI connection to the IMDb database.
# For a given production year, return US feature films (kind_id = 1) with
# more than 100,000 votes (info_type_id = 100), most-voted first.
get_top_voted <- function(db, year) {
  query <- sprintf(
    "SELECT t.title, t.production_year,
            mi.movie_id,
            mii.info_type_id, mii.info
     FROM title t
     LEFT JOIN movie_info mi ON mi.movie_id = t.id
     LEFT JOIN movie_info mi2 ON mi2.movie_id = t.id
     LEFT JOIN movie_info_idx mii ON mii.movie_id = t.id
     WHERE t.kind_id = 1 AND t.production_year = %d
       AND mi.info_type_id = 1 AND mi2.info = 'USA'
       AND mii.info_type_id = 100
       AND mii.info > 100000
     ORDER BY mii.info + 0 DESC;",  -- + 0 forces a numeric sort in case votes are stored as text
    as.integer(year)
  )
  dbGetQuery(db, query)
}

sql_1940_data <- get_top_voted(db, 1940)
sql_1960_data <- get_top_voted(db, 1960)
sql_1980_data <- get_top_voted(db, 1980)
sql_2000_data <- get_top_voted(db, 2000)
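# Pull every genre entry (info_type_id = 3) for feature films (kind_id = 1).
sql_genre_data <- db %>%
dbGetQuery("SELECT mi.info, 
t.title, t.production_year, t.id
FROM title t
JOIN movie_info mi ON t.id = mi.movie_id
WHERE mi.info_type_id = 3
AND t.kind_id = 1;")
# Keep only the four movies shown in the table, identified by their title ids.
movie_genre_info <- sql_genre_data %>%
  filter(id %in% c(4361365, 4118523, 4260164, 3649367))
# Collapse each movie's genres into one comma-separated string.
genres_info <- movie_genre_info %>%
  group_by(production_year, title) %>%
  summarize(Genre = paste(info, collapse = ',')) %>%
  ungroup()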
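
To single out the most-voted movie for a year, the vote counts have to be compared as numbers. The following is a minimal sketch in base R, assuming the query result has `title` and `info` columns and that `info` may come back as text. The rows are made-up placeholders, not IMDb data:

```r
# Hypothetical stand-in for one year's query result (values are invented).
votes <- data.frame(
  title = c("Movie A", "Movie B", "Movie C"),
  info  = c("999999", "1000000", "150000"),
  stringsAsFactors = FALSE
)

# Sorting the text column directly compares character by character,
# so "999999" lands ahead of "1000000".
top_as_text <- votes$title[order(votes$info, decreasing = TRUE)][1]

# Converting to numeric first gives the intended ranking.
votes$votes <- as.numeric(votes$info)
top_as_number <- votes$title[which.max(votes$votes)]

print(top_as_text)
print(top_as_number)
```

If the vote column is already numeric in a given database, the conversion is harmless and the two results agree.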

[We didn’t think that graphing the genres for each movie would serve the audience better than simply showing them as is. A graph for information this straightforward seemed counterproductive. We used kable to create small tables listing the genres of each movie. However, the caption was not showing up, so we had to create titles for each.]

# Show the four movies and their recorded genres as a table.
genres_info %>%
  head(4) %>%
  kable(caption = "IMDb Recorded Genres", 
        col.names = c("Production Year", "Movie", "Genres"), 
        align = "l")
IMDb Recorded Genres

| Production Year | Movie | Genres |
|:--|:--|:--|
| 1940 | The Great Dictator | Comedy,Drama,War |
| 1960 | Psycho | Horror,Mystery,Thriller |
| 1980 | Star Wars: Episode V - The Empire Strikes Back | Action,Adventure,Fantasy,Sci-Fi |
| 2000 | Gladiator | Action,Adventure,Drama |

When you go to the movies, what you expect is sometimes very different from what you get. Audiences had grown to love Charlie Chaplin as the Tramp, but he portrayed a drastically different character in his 1940 feature “The Great Dictator”.

“Once the horrors of the Holocaust began to be known, Hitler was no longer funny, not at all.”

“The Great Dictator” was released in 1940 and was Charlie Chaplin’s first talking picture as well as the highest-grossing film of his career. World War II lasted from 1939 to 1945, but the United States had not yet entered the war when the movie was released. Chaplin said in his 1964 autobiography, “Had I known of the actual horrors of the German concentration camps, I could not have made The Great Dictator, I could not have made fun of the homicidal insanity of the Nazis.”

Within the first few seconds of the original trailer for The Great Dictator we can see that the movie is a satire. Chaplin’s comedic chops are used to shed light on the atrocities happening in Germany. The Great Dictator is listed in IMDb under the genres comedy, drama, and war, which, while not inaccurate, overlooks the film’s propaganda, political satire, and commentary about war.

http://www.tcm.com/mediaroom/video/159742/Great-Dictator-The-Original-Trailer-.html

“The Great Dictator” was banned in many countries for its content and message, which encompass a larger range than its listing under “comedy” warrants.

banned in Argentina

The shock of audiences continues with our next feature. The most voted-on and highest-rated movie in IMDb for 1960 is Alfred Hitchcock’s Psycho, and much ado was made over its original marketing:

“The manager of this theatre has been instructed, at the risk of his life, not to admit any persons after the picture starts.”1

(From an article in the Guardian in 1960)

1960s “Psycho” ad in the New York Times

Psycho, with its no-holds-barred knife wielding, is often credited with starting the “slasher” genre; this is also not mentioned in the IMDb categorization. Who will forget that famous shower scene?

Alfred Hitchcock’s Psycho is listed in IMDb under the genres horror, thriller, and mystery. However, most of Psycho’s original marketing had to do with suspense; after all, Hitchcock was nicknamed the Master of Suspense.

The Daily News had this to say in their original 1960 review:

“Hitch has done it again…the suspense builds up slowly but surely to an almost unbearable pitch of excitement. “Psycho” is a murder mystery. It isn’t Hitchcock’s usual terrifier, a shocker of the nervous system; it’s a mind-teaser.”

Moviegoers in the 1960s were noting Psycho’s suspense and psychological terror, yet IMDb doesn’t categorize the movie that way. We had to search original reviews and marketing materials to get a sense of what audiences at the time were experiencing.

As we approach more modern times, the categorization of movies in IMDb becomes more accurate. For our next analysis we look at the ever-popular Star Wars franchise.

“He Will Join Us, Or Die!”

Darth Vader is not mincing words when it comes to the fifth installment of Star Wars: The Empire Strikes Back:

https://www.youtube.com/watch?v=fg9MsitOLh4

The Empire Strikes Back was the highest-grossing film of 1980, continuing the success of George Lucas’s series. It is listed in IMDb as Sci-Fi, Fantasy, Action, and Adventure, and we would largely agree. Star Wars is an established franchise where there is little ambiguity or miscategorization.

In looking at some of the original marketing material we saw a range of strategies, including this poster that tries to highlight the romance between Princess Leia and Han Solo:

One of the original posters

Thankfully IMDb does not categorize Star Wars as a romance! Audiences knew what they were in for and IMDb correctly categorizes Star Wars.

Lastly, we look at a movie that left audiences puzzled. Christopher Nolan’s Memento received enough votes on IMDb to make it one of the most popular and well-liked releases of 2000.

“I can’t make new memories”

explains Leonard Shelby (Guy Pearce), the vengeance-seeking insurance investigator at the center of “Memento,” Christopher Nolan’s ingenious new thriller. This is the opening line of Roger Ebert’s original 2001 review of Memento.

Memento is listed in IMDb as a mystery and thriller, and we would largely agree with this categorization.

This movie poster gives one of the movie’s taglines: “Some Memories Are Best Forgotten”

An alternate hypothesis:

We believe that the functionality of IMDb as a website limits its ability to register movies outside of the more commonly used genres. We found this to be true in that the older movies we analyzed, like The Great Dictator and Psycho, were lacking in genre categorization, while the newer releases were relatively accurate.
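
One way to make this comparison concrete is to list, for each older film, the labels from its marketing and reception that IMDb does not record. The sketch below is only illustrative: the IMDb genres come from the table of recorded genres, while the `marketed_as` labels are our own reading of the trailers and reviews discussed in this post, not a dataset or an IMDb field.

```r
# IMDb genres as recorded in the table, split into character vectors.
imdb_genres <- list(
  "The Great Dictator" = strsplit("Comedy,Drama,War", ",")[[1]],
  "Psycho"             = strsplit("Horror,Mystery,Thriller", ",")[[1]]
)

# Labels this post associates with each film's marketing and reception
# (an assumption for illustration, based on our own reading).
marketed_as <- list(
  "The Great Dictator" = c("Comedy", "Political Satire", "Propaganda", "War"),
  "Psycho"             = c("Suspense", "Mystery", "Slasher")
)

# For each film, keep the marketed labels that IMDb does not list.
missing_labels <- Map(setdiff, marketed_as, imdb_genres[names(marketed_as)])
print(missing_labels)
```

For these two films the leftover labels are the satire, propaganda, suspense, and slasher descriptions that the post argues IMDb leaves out.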

Movie watching is a classic pastime, whether you are watching a film with your family or picking just the right movie for that hot date you are excited about. While the categorization of movies isn’t necessarily at the forefront of people’s minds, it is helpful to look at how movies are cataloged and at the accuracy, including the historical accuracy, of databases such as IMDb.

github


  1. Taken from: *footnotes/citations with links